Vision-based robot control system

ABSTRACT

Various systems and methods for providing a vision-based robot control system are provided herein. A vision-based robot control system comprises a camera system interface to receive image data from a camera system; a trigger detection unit to determine a triggering action from the image data; and a transceiver to initiate a robot operation associated with the triggering action.

PRIORITY APPLICATION

This application is a continuation of U.S. application Ser. No. 15/185,241, filed Jun. 17, 2016, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

Embodiments described herein generally relate to robotic controls and, in particular, to a vision-based robot control system.

BACKGROUND

Robots are mechanical or electro-mechanical machines able to act as agents for human operators. Some robots are automated or semi-automated and able to perform tasks with minimal human input. Robots are used in residential, industrial, and commercial settings. As electronics and manufacturing processes scale, robot use is becoming more widespread.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. Like numerals having different letter suffixes may represent different instances of similar components. Some embodiments are illustrated by way of example, and not limitation, in the figures of the accompanying drawings in which:

FIG. 1 is a diagram illustrating a robot operating in an environment, according to an embodiment;

FIG. 2 is a flowchart illustrating the data and control flow, according to an embodiment;

FIG. 3 is a block diagram illustrating a vision-based robot control system for controlling a robot, according to an embodiment;

FIG. 4 is a flowchart illustrating a method for providing a vision-based robot control system, according to an embodiment; and

FIG. 5 is a block diagram illustrating an example machine upon which any one or more of the techniques (e.g., methodologies) discussed herein may perform, according to an example embodiment.

DETAILED DESCRIPTION

In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of some example embodiments. It will be evident, however, to one skilled in the art that the present disclosure may be practiced without these specific details.

Disclosed herein are systems and methods that provide a vision-based robot control system. Consumer-level robots are becoming a reality with lower-cost manufacturing and more affordable electronics. Robots may include various sensors to detect their environment. Example sensors include proximity sensors, position sensors, impact sensors, and cameras or other image-based sensors. While some robots are able to be controlled using buttons on an exterior housing, with remote control devices, or with auxiliary controls, such as controls at a charging station, more intuitive ways to control robots are needed.

FIG. 1 is a diagram illustrating a robot 100 operating in an environment 102, according to an embodiment. The robot 100 may be any type of self-propelled, mobile machine. The robot 100 may use various apparatus to move, such as wheels, treads, tracks, or the like. The robot 100 may also implement one or more legs to hop, walk, crawl, or otherwise move (e.g., fly) about an environment. The robot 100 may include an onboard camera system, which may include one or more cameras. The camera system may include a visual light camera (e.g., an RGB camera), an infrared (IR) light camera, or other cameras. The IR camera may be used for night vision, as a depth camera, or for thermal imagery. In addition to a camera system, the robot 100 may also be equipped with other sensors, such as a sonar system, radar system, etc., to navigate environments. The robot 100 also includes one or more communication systems, which may include a wireless networking communication system. The wireless networking communication system may use one or more of a variety of protocols or technologies, including Wi-Fi, 3G and 4G LTE/LTE-A, WiMAX networks, Bluetooth, near field communication (NFC), or the like.

Alternatively, the robot 100 may interface with one or more cameras 104A, 104B, 104C, 104D (collectively referred to as 104). The cameras 104 may capture a user's movement, actions, or other aspects of the user 106 as the user 106 moves about in the environment 102. Cameras 104 in the environment 102 may include visible light cameras, infrared cameras, or other types of cameras. The cameras 104 may be connected using wired or wireless connections. In addition, one or more of the cameras 104 may use one or more servos for pan and tilt to follow a subject while it is within the operating field of view of the camera 104. The camera 104 may track a subject using shape recognition or with a physical marker that the subject holds or wears and that the camera 104 actively tracks. The physical marker may be wirelessly connected to the camera 104 using a technology such as Bluetooth. Although only four cameras 104 are illustrated in FIG. 1, it is understood that more or fewer cameras may be implemented based on the size of the environment, obstructions in the environment, or other considerations. Combinations of onboard cameras and environmental cameras may be used.

The user 106 may interface with the robot 100 using various intuitive techniques. In an aspect, the user 106 gestures in a particular manner, which may then be captured by the cameras 104 or the robot 100, or by other environmental sensors. The gesture may provide an instruction to the robot 100 based on a preconfigured association. For instance, the user 106 may mark a spot on the floor by tapping his foot in a prescribed manner. The robot 100 may be a cleaning robot, such as a semi-automated vacuum. The tapping may be detected from the user's motion using cameras 104, by sound processing (e.g., using a microphone array), by vibration (e.g., by using in-floor sensors to detect the vibration location), or by other mechanisms or combinations of mechanisms. After receiving an indication of the gesture performed by the user 106, the robot 100 may concentrate on the area indicated by the gesture, such as by performing extra passes or by temporarily slowing down to clean the area more thoroughly.

It is understood that while some embodiments discussed in this disclosure include camera and image processing, other modalities of gesture detection may be used instead of, or in combination with, camera and image processing techniques. Thus, gesture detection is used to identify a particular location at which to perform a robotic action, and as such, in some cases, the gesture is used to intuitively identify the location (e.g., pointing to a spot on the floor, motioning to an area, tapping the floor with a foot, etc.).

In another aspect, the user 106 may provide instructions using a marker with an infrared-detectable ink. The ink is not visible to human eyes, so it does not discolor or mar materials, such as carpet or furniture upholstery. The robot 100 may be configured with an IR camera, or the environmental camera 104 may be an IR camera, which is able to see the markings left by the user 106. Different markings may be used, such as a box to indicate extra cleaning, a circle to instruct the robot 100 to use a different cleaner, or an “X” to avoid an area. Other markings may be used. In another aspect, the robot 100 may clean away the IR-detectable ink when cleaning the marked area. As such, the ink marking may act as a one-time instruction. Alternatively, the IR ink may be left behind purposefully, so that the mark is able to be observed on subsequent cleanings.
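To make the marking-to-operation association concrete, the following is a minimal Python sketch of such a lookup. The symbol labels and operation identifiers are illustrative assumptions (the disclosure does not name them), and an upstream classifier is assumed to have already converted the IR image into a symbol string.

```python
from typing import Optional

# Hypothetical mapping from recognized IR-ink symbols to robot operations,
# mirroring the examples above: box = extra cleaning, circle = alternate
# cleaner, "X" = avoid the area. Names are assumptions, not from the source.
MARKING_OPERATIONS = {
    "box": "extra_clean",
    "circle": "use_alternate_cleaner",
    "x": "avoid_area",
}

def operation_for_marking(symbol: str) -> Optional[str]:
    """Return the robot operation for a recognized IR-ink symbol, or None."""
    return MARKING_OPERATIONS.get(symbol.lower())
```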

In addition to visual cues, the user 106 may also provide instructions with other modalities, such as verbally or with geolocation. In an example, the user 106 may gesture and then speak a verbal command, which is received at the robot 100 and then acted upon by the robot 100. For instance, the user 106 may point to a spot on the floor with a laser pointer and speak “move the shelf over here.” The robot 100 may directly receive such commands with an onboard camera system and a microphone. Alternatively, environmental sensors, such as cameras 104 and microphones, may detect the user's actions and communicate them as commands to the robot 100. Upon receiving an actionable instruction, the robot 100 may move to the shelf, lift it, and then move it to the location indicated by the user's gesture.

In another aspect, the user 106 may use a geolocation as an additional input. For instance, the user 106 may use a mobile device 108 (e.g., smartphone, tablet, wearable device, etc.) and perform a gesture while holding or wearing the mobile device. The user 106 may also initiate verbal commands to the mobile device. The location of the mobile device 108 (e.g., geolocation) may be obtained and transmitted to the robot 100 along with an instruction, as defined by the gesture or verbal instruction provided by the user 106.
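One way to picture the combined input is a small message that pairs the instruction derived from the gesture or verbal command with the device's reported position. The sketch below is an assumption for illustration; the field names and the "move_shelf" command are hypothetical.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class GeolocatedInstruction:
    """Hypothetical message pairing a command with a device geolocation."""
    command: str      # derived from the gesture or verbal instruction
    latitude: float   # position reported by the mobile device 108
    longitude: float

    def to_json(self) -> str:
        return json.dumps(asdict(self))

# Example: a pointing gesture plus "move the shelf over here", stamped
# with the phone's current coordinates.
message = GeolocatedInstruction("move_shelf", 45.5231, -122.6765).to_json()
```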

FIG. 2 is a flowchart illustrating the data and control flow, according to an embodiment. A triggering action performed by a user is observed by a camera system (operation 200). For instance, a user may perform a gesture or leave a mark visible to the robot (e.g., with a token or with an ink visible to the camera system that may or may not be visible to unaided humans). The camera system interprets the action (operation 202). The interpretation may include gesture recognition, pattern recognition, shape recognition, or other forms of analysis to determine the type of triggering action performed by the user. If the triggering action is recognized, then additional processing may occur to determine whether the user provided any other triggering commands (operation 204), such as a verbal command. If a verbal command is recognized, for example, then the verbal command may be parsed and used in the subsequent operations. Additionally, geolocations, locations of pointing gestures, or other commands may be analyzed at operation 204.

A lookup table is referenced (operation 206) to determine which operation the robot is to perform based on the trigger input. The types of trigger input and resulting robot operations may differ according to the design and function of the robot. For example, a cleaning robot may be programmed to respond to certain triggering actions, whereas a security robot may be programmed to respond to other triggering actions. In the case where several robots operate in the same environment, the user may map the triggering actions to trigger only one of the robots. Alternatively, the user may configure the commands such that a single command may cause multiple robots to perform certain functions.

If the triggering actions map to an action found in the lookup table, then the robot is scheduled to perform the resulting operation (operation 208). The robot may perform the operation immediately or may be configured to perform the operation at the next duty cycle (e.g., the next cleaning cycle).

If the triggering actions are not found in the lookup table, then the control flow returns to operation 200, where the system monitors for additional user triggering actions.
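The FIG. 2 flow can be summarized in code. The sketch below is a hypothetical composition of operations 200-208; the injected callables (observe, interpret, collect_aux, schedule) and the table entries are assumptions, not part of the disclosure.

```python
from typing import Callable

# Hypothetical lookup table (operation 206): a recognized trigger, possibly
# refined by an auxiliary command, maps to a robot operation.
OPERATION_TABLE = {
    ("foot_tap", None): "extra_clean_here",
    ("point", "move the shelf over here"): "move_shelf",
}

def run_control_loop(observe: Callable, interpret: Callable,
                     collect_aux: Callable, schedule: Callable) -> None:
    """Monitor for triggers and schedule matching robot operations.

    observe() yields sensor data (operation 200), interpret() recognizes a
    trigger (202), collect_aux() gathers voice or geolocation input (204),
    and schedule() queues the resolved operation on the robot (208).
    """
    while True:
        data = observe()                                 # operation 200
        trigger = interpret(data)                        # operation 202
        aux = collect_aux() if trigger else None         # operation 204
        operation = OPERATION_TABLE.get((trigger, aux))  # operation 206
        if operation is not None:
            schedule(operation)                          # operation 208
        # Unrecognized triggers fall through; monitoring resumes at 200.
```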

FIG. 3 is a block diagram illustrating a vision-based robot control system 300 for controlling a robot, according to an embodiment. The system 300 may be installed in a robot. Alternatively, the system 300 may be separate from a robot, but communicatively coupled to the robot in order to provide control signals. The system 300 includes a camera system interface 302, a trigger detection unit 304, an operation database interface 306, and a transceiver 308.

The camera system interface 302, trigger detection unit 304, operation database interface 306, and transceiver 308 are understood to encompass tangible entities that are physically constructed, specifically configured (e.g., hardwired), or temporarily (e.g., transitorily) configured (e.g., programmed) to operate in a specified manner or to perform part or all of any operations described herein. Such tangible entities may be constructed using one or more circuits, such as with dedicated hardware (e.g., field programmable gate arrays (FPGAs), logic gates, a graphics processing unit (GPU), a digital signal processor (DSP), etc.). As such, the tangible entities described herein may be referred to as circuits, circuitry, processor units, subsystems, or the like.

The camera system interface 302 may receive camera signal information from a camera array, which may be installed on the system 300 or be remote, but communicatively coupled to the system 300. For instance, the camera system interface 302 may receive raw video signal information or processed signal information (e.g., compressed video signals). In another aspect, the camera system interface 302 may be used to receive tagged information from a camera array that has analyzed and preprocessed the raw video signal at, or near, the camera array.
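As a rough illustration, the interface might expose separate entry points for raw (or compressed) frames and for already-analyzed tags. This is a sketch under assumed names; the disclosure does not prescribe an API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class CameraSystemInterface:
    """Hypothetical sketch of interface 302: accepts raw or compressed
    video from a camera array, or tagged events preprocessed at the array."""
    frames: List[Any] = field(default_factory=list)
    tags: List[Dict] = field(default_factory=list)

    def on_video(self, frame: Any) -> None:
        # Raw or processed (e.g., compressed) video signal information.
        self.frames.append(frame)

    def on_tagged_event(self, tag: Dict) -> None:
        # Metadata analyzed at or near the camera, e.g. {"event": "foot_tap"}.
        self.tags.append(tag)
```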

The trigger detection unit 304 is capable of analyzing image or video data received by the camera system interface 302, and of detecting and identifying a gesture exhibited by movements performed by a person in the image or video data. The trigger detection unit 304 determines whether the movements constitute a recognized gesture. If the movements do constitute a recognized gesture, the trigger detection unit 304 may trigger operations performed by a robot.

To detect the gesture, the trigger detection unit 304 may access image data of an arm, finger, foot, or hand motion of a user captured by a camera system, and identify the gesture based on the image data. The image data may be a number of successive images (e.g., video) over which the gesture is performed.
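A deliberately simple way to see how successive frames can carry a gesture is frame differencing: the fraction of pixels that change between frames rises while a motion such as a foot tap is performed. This sketch uses OpenCV and is only a stand-in for a real gesture recognizer; it is not the disclosed method.

```python
import cv2

def motion_energy(frames, threshold: int = 25) -> float:
    """Average fraction of pixels changing across successive BGR frames.

    A sustained spike in a region of interest could feed a downstream
    tap/point classifier; this is illustrative only.
    """
    if len(frames) < 2:
        return 0.0
    score = 0.0
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(gray, prev)             # frame-to-frame change
        score += float((diff > threshold).mean())  # moving-pixel fraction
        prev = gray
    return score / (len(frames) - 1)
```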

In another aspect, the trigger detection unit 304 accesses depth image data of an arm, finger, foot, or hand motion of the user and identifies the gesture based on the depth image data. The depth image data may be a number of successive images (e.g., video) over which the gesture is performed.

In another aspect, the trigger detection unit 304 operates independently of the camera system interface 302, and receives information or data that indicates a gesture via a different pathway. For example, in an aspect, to detect the gesture, the trigger detection unit 304 is to access motion data from an auxiliary device, the motion data describing an arm, finger, foot, or hand motion of the user, and identify the gesture based on the motion data. The auxiliary device may be a mobile device, such as a smartphone, a wearable device, such as a smartwatch or a glove, or another type of device moveable by the user in free space. Examples include, but are not limited to, smartphones, smartwatches, e-textiles (e.g., shirts, gloves, pants, shoes, etc.), smart bracelets, smart rings, or the like.

The operation database interface 306 is used to determine whether the gesture is detected and recognized as being a triggering gesture. If it is a recognized triggering gesture, then commands are transmitted to a robot using the transceiver 308. The transceiver 308 may be configured to transmit over various wireless networks, such as a Wi-Fi network (e.g., according to the IEEE 802.11 family of standards) or a cellular network, such as a network designed according to the Long-Term Evolution (LTE), LTE-Advanced, 5G, or Global System for Mobile Communications (GSM) families of standards, or the like. When the system 300 is incorporated into a robot, the commands may be directly communicated to a robot controller by way of a wired connection.
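For the wireless path, the command handoff can be as simple as serializing a command sequence and sending it to the robot's controller over an IP network. The sketch below assumes a JSON-over-TCP convention and a listening robot endpoint; both are illustrative assumptions, not part of the disclosure.

```python
import json
import socket

def transmit_command(robot_addr, operation: str) -> None:
    """Sketch of the transceiver 308 path: send a command sequence to the
    robot over Wi-Fi or a cellular uplink. robot_addr is a (host, port)
    tuple for a hypothetical robot-side command listener."""
    payload = json.dumps({"op": operation}).encode("utf-8")
    with socket.create_connection(robot_addr, timeout=2.0) as sock:
        sock.sendall(payload)  # the robot-side controller parses and acts
```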

Thus, the system 300 describes a vision-based robot control system, the system 300 comprising the camera system interface 302, the trigger detection unit 304, and the transceiver 308. The camera system interface 302 may be configured to receive image data from a camera system.

The trigger detection unit 304 may be configured to determine a triggering action from the image data.

The transceiver 308 may be configured to initiate a robot operation associated with the triggering action. In an embodiment, to initiate the robot operation, the transceiver 308 is to transmit a command sequence to a robot over a wireless network. In an embodiment, the robot operation comprises a cleaning task.

In an embodiment, to receive the image data, the camera system interface 302 is to receive an image of a user performing a gesture. In such an embodiment, to determine the triggering action, the trigger detection unit 304 is to determine the triggering action corresponding to the gesture. In an embodiment, the gesture comprises pointing to a location in an environment. In an embodiment, the gesture comprises tapping a foot on a location in an environment. In an embodiment, the triggering action is the gesture and the robot operation comprises performing extra cleaning at the location in the environment.

In an embodiment, the image data includes a gesture performed by a user, and the trigger detection unit 304 is to receive a voice command issued by the user and determine the triggering action using the gesture and the voice command. In a further embodiment, the gesture is used to specify a location in the environment, and the voice command is used to specify an action to be taken by the robot.

In an embodiment, the image data includes a non-visible marking made by a user, the non-visible marking acting as the triggering action. In a further embodiment, the non-visible marking comprises an infrared-visible ink marking. In another embodiment, the non-visible marking is a symbol, and to initiate the robot operation, the operation database interface 306 is used to access a lookup table and search for the symbol to identify a corresponding robot operation.

In an embodiment, the trigger detection unit 304 is to obtain a geolocation associated with the triggering action and the transceiver 308 is to initiate the robot operation to be performed at the geolocation. In a further embodiment, the geolocation is obtained from a device operated by a user performing the triggering action. In another embodiment, the geolocation is obtained from a gesture performed by the user, the gesture captured in the image data.

FIG. 4 is a flowchart illustrating a method 400 for providing a vision-based robot control system, according to an embodiment. At block 402, image data from a camera system is received at a processor-based robot control system.

At block 404, a triggering action is determined from the image data.

At block 406, a robot operation associated with the triggering action is initiated. In an embodiment, initiating the robot operation comprises transmitting a command sequence to a robot over a wireless network. Various networks may be used, such as Bluetooth, Wi-Fi, or the like.

In an embodiment, the robot operation comprises a cleaning task. Other robotic tasks are understood to be within the scope of this disclosure.

In an embodiment, receiving the image data comprises receiving an image of a user performing a gesture. In such an embodiment, determining the triggering action comprises determining the triggering action corresponding to the gesture. The gesture may be any type of gesture, as discussed above, and may include actions such as pointing, tapping one's foot, or other similar gestures. Thus, in an embodiment, the gesture comprises pointing to a location in an environment. In another embodiment, the gesture comprises tapping a foot on a location in an environment. In such embodiments, the triggering action is the gesture and the robot operation comprises performing extra cleaning at the location in the environment.

In an embodiment, the image data includes a gesture performed by a user. The user may contemporaneously issue a voice command to accompany the gesture and provide further parameters on the gesture command. Thus, in such an embodiment, the method 400 includes receiving a voice command issued by the user. Determining the triggering action then comprises using the gesture and the voice command to determine the triggering action.

In an embodiment, the gesture is used to specify a location in the environment, and the voice command is used to specify an action to be taken by the robot.

In an embodiment, the image data includes a non-visible marking made by a user, the non-visible marking acting as the triggering action. The non-visible marking may be infrared ink, as described above. Thus, in an embodiment, the non-visible marking comprises an infrared-visible ink marking. Various words, symbols, or other indicia may be made with such ink, and the robot control system may decipher the indicia and determine the meaning of the command. Thus, in an embodiment, the non-visible marking is a symbol, and initiating the robot operation comprises searching a lookup table for the symbol to identify a corresponding robot operation. The lookup table may be administered by the user, or by an administrative person, such as the manufacturer or provider of the robot and related services.

In an embodiment, the method 400 includes obtaining a geolocation associated with the triggering action and initiating the robot operation to be performed at the geolocation. The geolocation may be obtained in various ways, such as with a mobile device that is able to determine a global positioning system (GPS) location (or similar), or by a camera system able to track and determine the user's location when the user performs a gesture or leaves a marking. Thus, in an embodiment, the geolocation is obtained from a device operated by a user performing the triggering action. In another embodiment, the geolocation is obtained from a gesture performed by the user, the gesture captured in the image data.

Embodiments may be implemented in one or a combination of hardware, firmware, and software. Embodiments may also be implemented as instructions stored on a machine-readable storage device, which may be read and executed by at least one processor to perform the operations described herein. A machine-readable storage device may include any non-transitory mechanism for storing information in a form readable by a machine (e.g., a computer). For example, a machine-readable storage device may include read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash-memory devices, and other storage devices and media.

A processor subsystem may be used to execute the instructions on the machine-readable medium. The processor subsystem may include one or more processors, each with one or more cores. Additionally, the processor subsystem may be disposed on one or more physical devices. The processor subsystem may include one or more specialized processors, such as a graphics processing unit (GPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or a fixed function processor.

Examples, as described herein, may include, or may operate on, logic or a number of components, modules, or mechanisms. Modules may be hardware, software, or firmware communicatively coupled to one or more processors in order to carry out the operations described herein. Modules may be hardware modules, and as such modules may be considered tangible entities capable of performing specified operations and may be configured or arranged in a certain manner. In an example, circuits may be arranged (e.g., internally or with respect to external entities such as other circuits) in a specified manner as a module. In an example, the whole or part of one or more computer systems (e.g., a standalone, client or server computer system) or one or more hardware processors may be configured by firmware or software (e.g., instructions, an application portion, or an application) as a module that operates to perform specified operations. In an example, the software may reside on a machine-readable medium. In an example, the software, when executed by the underlying hardware of the module, causes the hardware to perform the specified operations. Accordingly, the term hardware module is understood to encompass a tangible entity, be that an entity that is physically constructed, specifically configured (e.g., hardwired), or temporarily (e.g., transitorily) configured (e.g., programmed) to operate in a specified manner or to perform part or all of any operation described herein. Considering examples in which modules are temporarily configured, each of the modules need not be instantiated at any one moment in time. For example, where the modules comprise a general-purpose hardware processor configured using software, the general-purpose hardware processor may be configured as respective different modules at different times. Software may accordingly configure a hardware processor, for example, to constitute a particular module at one instance of time and to constitute a different module at a different instance of time. Modules may also be software or firmware modules, which operate to perform the methodologies described herein.

Circuitry or circuits, as used in this document, may comprise, for example, singly or in any combination, hardwired circuitry, programmable circuitry such as computer processors comprising one or more individual instruction processing cores, state machine circuitry, and/or firmware that stores instructions executed by programmable circuitry. The circuits, circuitry, or modules may, collectively or individually, be embodied as circuitry that forms part of a larger system, for example, an integrated circuit (IC), system on-chip (SoC), desktop computers, laptop computers, tablet computers, servers, smart phones, etc.

FIG. 5 is a block diagram illustrating a machine in the example form of a computer system 500, within which a set or sequence of instructions may be executed to cause the machine to perform any one of the methodologies discussed herein, according to an example embodiment. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of either a server or a client machine in server-client network environments, or it may act as a peer machine in peer-to-peer (or distributed) network environments. The machine may be a wearable device, personal computer (PC), a tablet PC, a hybrid tablet, a personal digital assistant (PDA), a mobile telephone, or any machine capable of executing instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. Similarly, the term “processor-based system” shall be taken to include any set of one or more machines that are controlled by or operated by a processor (e.g., a computer) to individually or jointly execute instructions to perform any one or more of the methodologies discussed herein.

Example computer system 500 includes at least one processor 502 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both, processor cores, compute nodes, etc.), a main memory 504 and a static memory 506, which communicate with each other via a link 508 (e.g., bus). The computer system 500 may further include a video display unit 510, an alphanumeric input device 512 (e.g., a keyboard), and a user interface (UI) navigation device 514 (e.g., a mouse). In one embodiment, the video display unit 510, input device 512 and UI navigation device 514 are incorporated into a touch screen display. The computer system 500 may additionally include a storage device 516 (e.g., a drive unit), a signal generation device 518 (e.g., a speaker), a network interface device 520, and one or more sensors (not shown), such as a global positioning system (GPS) sensor, compass, accelerometer, gyrometer, magnetometer, or other sensor.

The storage device 516 includes a machine-readable medium 522 on which is stored one or more sets of data structures and instructions 524 (e.g., software) embodying or utilized by any one or more of the methodologies or functions described herein. The instructions 524 may also reside, completely or at least partially, within the main memory 504, static memory 506, and/or within the processor 502 during execution thereof by the computer system 500, with the main memory 504, static memory 506, and the processor 502 also constituting machine-readable media.

While the machine-readable medium 522 is illustrated in an example embodiment to be a single medium, the term “machine-readable medium” may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions 524. The term “machine-readable medium” shall also be taken to include any tangible medium that is capable of storing, encoding or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure or that is capable of storing, encoding or carrying data structures utilized by or associated with such instructions. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of machine-readable media include non-volatile memory, including but not limited to, by way of example, semiconductor memory devices (e.g., electrically programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM)) and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.

The instructions 524 may further be transmitted or received over a communications network 526 using a transmission medium via the network interface device 520 utilizing any one of a number of well-known transfer protocols (e.g., HTTP). Examples of communication networks include a local area network (LAN), a wide area network (WAN), the Internet, mobile telephone networks, plain old telephone (POTS) networks, and wireless data networks (e.g., Bluetooth, Wi-Fi, 3G, and 4G LTE/LTE-A or WiMAX networks). The term “transmission medium” shall be taken to include any intangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible medium to facilitate communication of such software.

Additional Notes & Examples

Example 1 is a vision-based robot control system, the system comprising: a camera system interface to receive image data from a camera system; a trigger detection unit to determine a triggering action from the image data; and a transceiver to initiate a robot operation associated with the triggering action.

In Example 2, the subject matter of Example 1 optionally includes wherein to receive the image data, the camera system interface is to receive an image of a user performing a gesture; and wherein to determine the triggering action, the trigger detection unit is to determine the triggering action corresponding to the gesture.

In Example 3, the subject matter of Example 2 optionally includes wherein the gesture comprises pointing to a location in an environment.

In Example 4, the subject matter of any one or more of Examples 2-3 optionally include wherein the gesture comprises tapping a foot on a location in an environment.

In Example 5, the subject matter of any one or more of Examples 3-4 optionally include wherein the triggering action is the gesture and the robot operation comprises performing extra cleaning at the location in the environment.

In Example 6, the subject matter of any one or more of Examples 1-5 optionally include wherein the image data includes a gesture performed by a user, and wherein the trigger detection unit is to receive a voice command issued by the user and determine the triggering action using the gesture and the voice command.

In Example 7, the subject matter of Example 6 optionally includes wherein the gesture is used to specify a location in the environment, and wherein the voice command is used to specify an action to be taken by the robot.

In Example 8, the subject matter of any one or more of Examples 1-7 optionally include wherein the image data includes a non-visible marking made by a user, the non-visible marking acting as the triggering action.

In Example 9, the subject matter of Example 8 optionally includes wherein the non-visible marking comprises an infrared-visible ink marking.

In Example 10, the subject matter of any one or more of Examples 8-9 optionally include wherein the non-visible marking is a symbol, and to initiate the robot operation, an operation database interface is used to access a lookup table and search for the symbol to identify a corresponding robot operation.

In Example 11, the subject matter of any one or more of Examples 1-10 optionally include wherein the trigger detection unit is to obtain a geolocation associated with the triggering action; and wherein the transceiver is to initiate the robot operation to be performed at the geolocation.

In Example 12, the subject matter of Example 11 optionally includes wherein the geolocation is obtained from a device operated by a user performing the triggering action.

In Example 13, the subject matter of any one or more of Examples 11-12 optionally include wherein the geolocation is obtained from a gesture performed by the user, the gesture captured in the image data.

In Example 14, the subject matter of any one or more of Examples 1-13 optionally include wherein to initiate the robot operation, the transceiver is to transmit a command sequence to a robot over a wireless network.

In Example 15, the subject matter of any one or more of Examples 1-14 optionally include wherein the robot operation comprises a cleaning task.

Example 16 is a method of providing a vision-based robot control system, the method comprising: receiving, at a processor-based robot control system, image data from a camera system; determining a triggering action from the image data; and initiating a robot operation associated with the triggering action.

In Example 17, the subject matter of Example 16 optionally includes wherein receiving the image data comprises receiving an image of a user performing a gesture; and wherein determining the triggering action comprises determining the triggering action corresponding to the gesture.

In Example 18, the subject matter of Example 17 optionally includes wherein the gesture comprises pointing to a location in an environment.

In Example 19, the subject matter of any one or more of Examples 17-18 optionally include wherein the gesture comprises tapping a foot on a location in an environment.

In Example 20, the subject matter of any one or more of Examples 18-19 optionally include wherein the triggering action is the gesture and the robot operation comprises performing extra cleaning at the location in the environment.

In Example 21, the subject matter of any one or more of Examples 16-20 optionally include wherein the image data includes a gesture performed by a user, and wherein the method comprises receiving a voice command issued by the user; and wherein determining the triggering action comprises using the gesture and the voice command to determine the triggering action.

In Example 22, the subject matter of Example 21 optionally includes wherein the gesture is used to specify a location in the environment, and wherein the voice command is used to specify an action to be taken by the robot.

In Example 23, the subject matter of any one or more of Examples 16-22 optionally include wherein the image data includes a non-visible marking made by a user, the non-visible marking acting as the triggering action.

In Example 24, the subject matter of Example 23 optionally includes wherein the non-visible marking comprises an infrared-visible ink marking.

In Example 25, the subject matter of any one or more of Examples 23-24 optionally include wherein the non-visible marking is a symbol, and wherein initiating the robot operation comprises searching a lookup table for the symbol to identify a corresponding robot operation.

In Example 26, the subject matter of any one or more of Examples 16-25 optionally include obtaining a geolocation associated with the triggering action; and initiating the robot operation to be performed at the geolocation.

In Example 27, the subject matter of Example 26 optionally includes wherein the geolocation is obtained from a device operated by a user performing the triggering action.

In Example 28, the subject matter of any one or more of Examples 26-27 optionally include wherein the geolocation is obtained from a gesture performed by the user, the gesture captured in the image data.

In Example 29, the subject matter of any one or more of Examples 16-28 optionally include wherein initiating the robot operation comprises transmitting a command sequence to a robot over a wireless network.

In Example 30, the subject matter of any one or more of Examples 16-29 optionally include wherein the robot operation comprises a cleaning task.

Example 31 is at least one machine-readable medium including instructions, which when executed by a machine, cause the machine to perform operations of any of the methods of Examples 16-30.

Example 32 is an apparatus comprising means for performing any of the methods of Examples 16-30.

Example 33 is an apparatus for providing a vision-based robot control system, the apparatus comprising: means for receiving, at a processor-based robot control system, image data from a camera system; means for determining a triggering action from the image data; and means for initiating a robot operation associated with the triggering action.

In Example 34, the subject matter of Example 33 optionally includes wherein the means for receiving the image data comprise means for receiving an image of a user performing a gesture; and wherein the means for determining the triggering action comprise means for determining the triggering action corresponding to the gesture.

In Example 35, the subject matter of Example 34 optionally includes wherein the gesture comprises pointing to a location in an environment.

In Example 36, the subject matter of any one or more of Examples 34-35 optionally include wherein the gesture comprises tapping a foot on a location in an environment.

In Example 37, the subject matter of any one or more of Examples 35-36 optionally include wherein the triggering action is the gesture and the robot operation comprises performing extra cleaning at the location in the environment.

In Example 38, the subject matter of any one or more of Examples 33-37 optionally include wherein the image data includes a gesture performed by a user, and wherein the apparatus comprises means for receiving a voice command issued by the user; and wherein the means for determining the triggering action comprise means for using the gesture and the voice command to determine the triggering action.

In Example 39, the subject matter of Example 38 optionally includes wherein the gesture is used to specify a location in the environment, and wherein the voice command is used to specify an action to be taken by the robot.

In Example 40, the subject matter of any one or more of Examples 33-39 optionally include wherein the image data includes a non-visible marking made by a user, the non-visible marking acting as the triggering action.

In Example 41, the subject matter of Example 40 optionally includes wherein the non-visible marking comprises an infrared-visible ink marking.

In Example 42, the subject matter of any one or more of Examples 40-41 optionally include wherein the non-visible marking is a symbol, and wherein the means for initiating the robot operation comprise means for searching a lookup table for the symbol to identify a corresponding robot operation.

In Example 43, the subject matter of any one or more of Examples 33-42 optionally include means for obtaining a geolocation associated with the triggering action; and means for initiating the robot operation to be performed at the geolocation.

In Example 44, the subject matter of Example 43 optionally includes wherein the geolocation is obtained from a device operated by a user performing the triggering action.

In Example 45, the subject matter of any one or more of Examples 43-44 optionally include wherein the geolocation is obtained from a gesture performed by the user, the gesture captured in the image data.

In Example 46, the subject matter of any one or more of Examples 33-45 optionally include wherein the means for initiating the robot operation comprise means for transmitting a command sequence to a robot over a wireless network.

In Example 47, the subject matter of any one or more of Examples 33-46 optionally include wherein the robot operation comprises a cleaning task.

The above detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show, by way of illustration, specific embodiments that may be practiced. These embodiments are also referred to herein as “examples.” Such examples may include elements in addition to those shown or described. However, also contemplated are examples that include the elements shown or described. Moreover, also contemplated are examples using any combination or permutation of those elements shown or described (or one or more aspects thereof), either with respect to a particular example (or one or more aspects thereof), or with respect to other examples (or one or more aspects thereof) shown or described herein.

Publications, patents, and patent documents referred to in this document are incorporated by reference herein in their entirety, as though individually incorporated by reference. In the event of inconsistent usages between this document and those documents so incorporated by reference, the usage in the incorporated reference(s) is supplementary to that of this document; for irreconcilable inconsistencies, the usage in this document controls.

In this document, the terms “a” or “an” are used, as is common in patent documents, to include one or more than one, independent of any other instances or usages of “at least one” or “one or more.” In this document, the term “or” is used to refer to a nonexclusive or, such that “A or B” includes “A but not B,” “B but not A,” and “A and B,” unless otherwise indicated. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein.” Also, in the following claims, the terms “including” and “comprising” are open-ended; that is, a system, device, article, or process that includes elements in addition to those listed after such a term in a claim is still deemed to fall within the scope of that claim. Moreover, in the following claims, the terms “first,” “second,” and “third,” etc. are used merely as labels, and are not intended to suggest a numerical order for their objects.

The above description is intended to be illustrative, and not restrictive. For example, the above-described examples (or one or more aspects thereof) may be used in combination with others. Other embodiments may be used, such as by one of ordinary skill in the art upon reviewing the above description. The Abstract is to allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. Also, in the above Detailed Description, various features may be grouped together to streamline the disclosure. However, the claims may not set forth every feature disclosed herein, as embodiments may feature a subset of said features. Further, embodiments may include fewer features than those disclosed in a particular example. Thus, the following claims are hereby incorporated into the Detailed Description, with a claim standing on its own as a separate embodiment. The scope of the embodiments disclosed herein is to be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.

1. (canceled)
 2. A control system for operation of a mobile machine, comprising: a sensor interface to receive image data from a camera system, wherein the image data captures an action of a user that includes a gesture; detection circuitry to recognize the gesture from the image data and identify an operation for the mobile machine based on the recognized gesture; and a command interface to initiate the identified operation with the mobile machine.
 3. The system of claim 2, wherein the gesture includes one or more motions provided from at least one of: arm, hand, finger, or foot movements of the user.
 4. The system of claim 3, wherein the image data includes depth image data, wherein the camera system includes an infrared light camera to capture the depth image data of the one or more motions, and wherein the gesture is recognized based on three-dimensional information from the depth image data.
 5. The system of claim 2, wherein the action of the user further includes a voice command, and wherein the identified operation is determined from information obtained from the recognized gesture in combination with information obtained from the voice command.
 6. The system of claim 5, wherein the sensor interface is further to receive voice data, the voice data including the voice command of the user, wherein the detection circuitry is further to recognize the voice command from the voice data, and wherein the operation for the mobile machine is further identified based on the recognized voice command.
 7. The system of claim 2, wherein the identified operation causes movement of the mobile machine.
 8. The system of claim 2, wherein the identified operation causes performance of one or more actions by the mobile machine.
 9. The system of claim 2, wherein the camera system and the control system are integrated within the mobile machine.
 10. The system of claim 2, wherein the identified operation of the mobile machine relates to a location indicated by the action of the user.
11. A machine, comprising: a camera system to capture image data; and a control system to control operation of the machine, the control system comprising: a sensor interface to receive the image data from the camera system, wherein the image data captures an action of a user that includes a gesture; detection circuitry to recognize the gesture from the image data and identify an operation for the machine based on the recognized gesture; and a command interface to initiate the identified operation with the machine.
 12. The machine of claim 11, wherein the gesture includes one or more motions provided from at least one of: arm, hand, finger, or foot movements of the user.
 13. The machine of claim 12, wherein the image data includes depth image data, wherein the camera system includes an infrared light camera to capture the depth image data of the one or more motions, and wherein the gesture is recognized based on three-dimensional information from the depth image data.
 14. The machine of claim 11, wherein the action of the user further includes a voice command, and wherein the identified operation is determined from information obtained from the recognized gesture in combination with information obtained from the voice command.
 15. The machine of claim 14, wherein the sensor interface is further to receive voice data, the voice data including the voice command of the user, wherein the detection circuitry is further to recognize the voice command from the voice data, and wherein the operation for the machine is further identified based on the recognized voice command.
 16. The machine of claim 11, wherein the identified operation causes movement of the machine.
 17. The machine of claim 11, wherein the identified operation causes performance of one or more actions by the machine.
 18. The machine of claim 11, wherein the identified operation of the machine relates to a location indicated by the action of the user.
19. At least one machine-readable storage medium comprising instructions stored thereupon, which when executed by circuitry of a control system for a mobile machine, cause the circuitry to perform operations comprising: obtaining image data captured by a camera system, wherein the image data captures an action of a user that includes a gesture; recognizing the gesture from the image data; identifying an operation for the mobile machine based on the recognized gesture; and initiating the identified operation with the mobile machine.
 20. The machine-readable storage medium of claim 19, wherein the gesture includes one or more motions provided from at least one of: arm, hand, finger, or foot movements of the user.
 21. The machine-readable storage medium of claim 20, wherein the image data includes depth image data, wherein the camera system includes an infrared light camera to capture the depth image data of the one or more motions, and wherein the gesture is recognized based on three-dimensional information from the depth image data.
 22. The machine-readable storage medium of claim 19, wherein the action of the user further includes a voice command, and wherein the identified operation is determined from information obtained from the recognized gesture in combination with information obtained from the voice command.
 23. The machine-readable storage medium of claim 22, the operations further comprising: obtaining voice data including the voice command of the user; recognizing the voice command from the voice data; and wherein the operation for the mobile machine is further identified based on the recognized voice command.
 24. The machine-readable storage medium of claim 19, wherein the identified operation causes movement of the mobile machine.
 25. The machine-readable storage medium of claim 19, wherein the identified operation causes performance of one or more actions by the mobile machine.
 26. The machine-readable storage medium of claim 19, wherein the camera system and the control system are integrated within the mobile machine.
 27. The machine-readable storage medium of claim 19, wherein the identified operation of the mobile machine relates to a location indicated by the action of the user.